-
Notifications
You must be signed in to change notification settings - Fork 28.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-33169][SQL][TESTS] Check propagation of datasource options to underlying file system for built-in file-based datasources #30067
Conversation
@cloud-fan @HyukjinKwon Please, take a look at it when you have time. |
Kubernetes integration test starting |
Kubernetes integration test status success |
Test build #129896 has finished for PR 30067 at commit
|
...re/src/test/scala/org/apache/spark/sql/execution/datasources/CommonFileDataSourceSuite.scala
Show resolved
Hide resolved
import org.apache.spark.sql.catalyst.plans.SQLHelper | ||
import org.apache.spark.sql.test.SQLTestData | ||
|
||
// The trait contains tests for all file-based data sources. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add some comments here about FileBasedDataSourceSuite
vs CommonFileDataSourceSuite
by linking properly?
For example, we can say like: CommonFileDataSourceSuite
requires to run the same codes in all file based data sources. FileBasedDataSourceSuite
is a more loose place where we put the tests with some variants/differences.
We should also better mention that libsvm is not included.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I included libsvm as well.
|
||
// The trait contains tests for all file-based data sources. | ||
trait CommonFileDataSourceSuite { | ||
self: QueryTest with AnyFunSuite with SQLTestData with SQLHelper => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't we just simply self-type QueryTest
alone without other traits?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me check that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Requiring QueryTest
is too much. I made it more simple.
Kubernetes integration test starting |
Kubernetes integration test status success |
Test build #129958 has finished for PR 30067 at commit
|
Kubernetes integration test starting |
Kubernetes integration test status failure |
Test build #129960 has finished for PR 30067 at commit
|
...re/src/test/scala/org/apache/spark/sql/execution/datasources/CommonFileDataSourceSuite.scala
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good otherwise.
Kubernetes integration test starting |
Test build #129987 has finished for PR 30067 at commit
|
Kubernetes integration test status success |
retest this please |
I am going to merge this. The tests all passed at https://github.com/apache/spark/runs/1271560150, and the last change is only comments. |
Merged to master. |
Kubernetes integration test starting |
Kubernetes integration test status success |
Test build #129996 has finished for PR 30067 at commit
|
What changes were proposed in this pull request?
CommonFileDataSourceSuite
with tests that can be executed for all built-in file-based datasources.CommonFileDataSourceSuite
to check that datasource options are propagated to underlying file systems as Hadoop configs.CommonFileDataSourceSuite
toAvroSuite
,OrcSourceSuite
,TextSuite
,JsonSuite
, CSVSuiteand to
ParquetFileFormatSuite`.AvroSuite
and fromOrcSourceSuite
.Why are the changes needed?
To improve test coverage and test all built-in file-based datasources.
Does this PR introduce any user-facing change?
No
How was this patch tested?
By running the affected test suites.